Explainable assessment of financial experts' credibility by classifying social media forecasts and checking the predictions with actual market data
García-Méndez, Silvia, de Arriba-Pérez, Francisco, González-González, Jaime, González-Castaño, Francisco J.
Social media include diverse interaction metrics related to user popularity, the most evident example being the number of user followers. The latter has raised concerns about the credibility of the posts by the most popular creators. However, most existing approaches to assess credibility in social media strictly consider this problem a binary classification, often based on a priori information, without checking if actual real-world facts back the users' comments. In addition, they do not provide automatic explanations of their predictions to foster their trustworthiness. In this work, we propose a credibility assessment solution for financial creators in social media that combines Natural Language Processing and Machine Learning. The reputation of the contributors is assessed by automatically classifying their forecasts on asset values by type and verifying these predictions with actual market data to approximate their probability of success. The outcome of this verification is a continuous credibility score instead of a binary result, an entirely novel contribution by this work. Moreover, social media metrics (i.e., user context) are exploited by calculating their correlation with the credibility rankings, providing insights on the interest of the end-users in financial posts and their forecasts (i.e., drop or rise). Finally, the system provides natural language explanations of its decisions based on a model-agnostic analysis of relevant features.
Deep learning based Auto Tuning for Database Management System
Gunasekaran, Karthick Prasad, Tiwari, Kajal, Acharya, Rachana
The management of database system configurations is a challenging task, as there are hundreds of configuration knobs that control every aspect of the system. This is complicated by the fact that these knobs are not standardized, independent, or universal, making it difficult to determine optimal settings. An automated approach to address this problem, using supervised and unsupervised machine learning methods to select impactful knobs, map unseen workloads, and recommend knob settings, was implemented in a new tool called OtterTune and is being evaluated on three DBMSs, with results demonstrating that it recommends configurations as good as or better than those generated by existing tools or a human expert. In this work, we extend an automated technique based on OtterTune [1] to reuse training data gathered from previous sessions to tune new DBMS deployments, using supervised and unsupervised machine learning methods to improve latency prediction. Our approach expands the methods proposed in the original paper. We use GMM clustering to prune metrics and combine ensemble models, such as RandomForest, with non-linear models, like neural networks, for prediction modeling.
Development of Machine learning algorithms to identify the Cobb angle in adolescents with idiopathic scoliosis based on lumbosacral joint efforts during gait (Case study)
Samadi, Bahare, Raison, Maxime, Mahaudens, Philippe, Detrembleur, Christine, Achiche, Sofiane
Objectives: To quantify the magnitude of spinal deformity in adolescent idiopathic scoliosis (AIS), the Cobb angle is measured on X-ray images of the spine. Continuous exposure to X-ray radiation to follow up the progression of scoliosis may lead to negative side effects on patients. Furthermore, manual measurement of the Cobb angle could lead to a difference of 10° or more due to intra/inter-observer variation. Therefore, the objective of this study is to identify the Cobb angle by developing an automated radiation-free model using machine learning algorithms. Methods: Thirty participants with lumbar/thoracolumbar AIS (15° < Cobb angle < 66°) performed gait cycles. The lumbosacral (L5-S1) joint efforts during six gait cycles of participants were used as features to feed training algorithms. Various regression algorithms were implemented and run. Results: The decision tree regression algorithm achieved the best result, with a mean absolute error of 4.6° averaged over 10-fold cross-validation. Conclusions: This study shows that the lumbosacral joint efforts during gait, as radiation-free data, are capable of identifying the Cobb angle using machine learning algorithms. The proposed model can be considered an alternative, radiation-free method to X-ray radiography to assist clinicians in following up the progression of AIS.
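As a rough illustration of what "mean absolute error averaged over k-fold cross-validation" means, the sketch below computes that quantity in plain Python. It is not the study's pipeline: the helper names (k_fold_mae, fit_mean, predict_mean) are hypothetical, the fold assignment is a simple interleaved split, the model is a mean-value baseline standing in for the decision tree regressor, and the four data points are toy values invented purely for the example.

```python
from statistics import mean


def k_fold_mae(features, targets, k, fit, predict):
    """Average the per-fold mean absolute error over k folds.

    fit(train_x, train_y) returns a model; predict(model, x) returns a number.
    Fold i holds out samples i, i + k, i + 2k, ... (an assumed split scheme).
    """
    n = len(targets)
    if len(features) != n:
        raise ValueError("features and targets must have the same length")
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of samples")

    fold_maes = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [features[i] for i in range(n) if i not in test_idx]
        train_y = [targets[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)
        errors = [abs(predict(model, features[i]) - targets[i])
                  for i in sorted(test_idx)]
        fold_maes.append(mean(errors))
    return mean(fold_maes)


# Placeholder model (an assumption): always predict the training mean.
def fit_mean(train_x, train_y):
    return mean(train_y)


def predict_mean(model, x):
    return model


# Toy data, invented for illustration only (not from the study).
toy_features = [[0.0], [1.0], [2.0], [3.0]]
toy_targets = [10.0, 20.0, 30.0, 40.0]
print(k_fold_mae(toy_features, toy_targets, 2, fit_mean, predict_mean))  # 10.0
```

Any regressor can be plugged in through the fit and predict callables; with k=10 and thirty samples, each fold would hold out three participants.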
F1 to F-beta
Originally published on Towards AI, the World's Leading AI and Technology News and Media Company. If you are building an AI-related product or service, we invite you to consider becoming an AI sponsor. At Towards AI, we help scale AI and technology startups. Let us help you unleash your technology to the masses. The F1 score is a popular binary classification metric representing a balance between precision and recall. It is the harmonic mean of precision and recall.
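To make the harmonic-mean definition concrete, here is a minimal sketch of the standard F-beta formula, of which F1 is the special case beta = 1. The function name f_beta and the sample precision and recall values are assumptions chosen for illustration.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily, beta < 1 weights precision
    more heavily, and beta == 1 gives the F1 score.
    """
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    if beta <= 0:
        raise ValueError("beta must be positive")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero; the score is taken as 0.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Illustrative values only.
print(round(f_beta(0.8, 0.4), 3))            # 0.533 (F1)
print(round(f_beta(0.8, 0.4, beta=2.0), 3))  # 0.444 (recall weighted more)
print(round(f_beta(0.8, 0.4, beta=0.5), 3))  # 0.667 (precision weighted more)
```

Because recall is the weaker of the two values here, the score drops as beta grows and rises as beta shrinks.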
ROC and AUC for Model Evaluation
The ROC (Receiver Operating Characteristic) curve is among the most frequently used tools for evaluating binary or multi-class classification models. Unlike most other metrics, and like the precision-recall curve, it is calculated from prediction scores rather than predicted classes. My previous post highlighted the importance of the precision-recall curve and showed how to plot it for multi-class classification. To understand the ROC curve, let's quickly refresh our memory of the possible outcomes in a binary classification problem by referring to the confusion matrix. The ROC curve plots the True Positive Rate (TPR) against the False Positive Rate (FPR) at various threshold values. It helps to visualize how the threshold affects classifier performance.
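The threshold sweep described above can be sketched in a few lines of plain Python. This is an illustrative implementation, not the post's own code: the function names roc_points and auc, the convention that a score at or above the threshold counts as a positive prediction, and the four sample scores are all assumptions.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per distinct score threshold.

    A sample is predicted positive when its score is >= the threshold.
    The list starts at (0, 0) and ends at (1, 1).
    """
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Illustrative labels and scores only.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
curve = roc_points(labels, scores)
print(curve)       # [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
print(auc(curve))  # 0.75
```

Lowering the threshold moves along the curve from (0, 0) toward (1, 1): more samples are predicted positive, so TPR and FPR can only stay the same or increase.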
Introduction to Confusion Matrix
The Confusion Matrix is the visual representation of the Actual VS Predicted values.
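As a small sketch of the idea, the snippet below tallies actual versus predicted labels for a binary problem into the four usual cells. The function name confusion_counts, the encoding of the positive class as 1, and the sample labels are assumptions made for illustration.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["TP"] += 1  # positive correctly predicted positive
        elif a == 0 and p == 1:
            counts["FP"] += 1  # negative wrongly predicted positive
        elif a == 1 and p == 0:
            counts["FN"] += 1  # positive wrongly predicted negative
        else:
            counts["TN"] += 1  # negative correctly predicted negative
    return counts


# Illustrative labels only.
actual = [1, 0, 1, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 0]
c = confusion_counts(actual, predicted)

# Rows are actual classes, columns are predicted classes.
print("              pred=1  pred=0")
print(f"actual=1      {c['TP']:>6}  {c['FN']:>6}")
print(f"actual=0      {c['FP']:>6}  {c['TN']:>6}")
```

For these sample labels the counts are TP = 2, FN = 1, FP = 1, and TN = 2.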
Top 12 Machine Learning Algorithms You Should Know to Become a Data Scientist
Let's say I am given an Excel sheet with data about various fruits and I have to tell which ones look like apples. I would ask the question "Which fruits are red and round?" and split the fruits into those that answer yes and those that answer no. Now, not all red and round fruits are apples, and not all apples are red and round. So I would ask "Which fruits have red or yellow color hints on them?" of the red and round fruits, and "Which fruits are green and round?" of the fruits that are not red and round. Based on these questions, I can tell with considerable accuracy which are apples. This cascade of questions is what a decision tree is. However, this is a decision tree based on my intuition.
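The cascade of questions can be written down directly as nested conditions. The sketch below is one possible reading of the example: the Fruit fields and the looks_like_apple name are hypothetical, and treating a "yes" to the second-level question as "apple" is an assumption, since the text does not spell out what each answer leads to.

```python
from dataclasses import dataclass


@dataclass
class Fruit:
    # Hypothetical columns of the spreadsheet in the example.
    is_red: bool
    is_green: bool
    is_round: bool
    has_red_or_yellow_hints: bool


def looks_like_apple(fruit: Fruit) -> bool:
    """Hand-written decision tree mirroring the cascade of questions."""
    # Root question: is the fruit red and round?
    if fruit.is_red and fruit.is_round:
        # "Yes" branch: does it have red or yellow color hints?
        return fruit.has_red_or_yellow_hints
    # "No" branch: is it green and round?
    return fruit.is_green and fruit.is_round


red_apple = Fruit(is_red=True, is_green=False, is_round=True, has_red_or_yellow_hints=True)
green_apple = Fruit(is_red=False, is_green=True, is_round=True, has_red_or_yellow_hints=False)
banana = Fruit(is_red=False, is_green=False, is_round=False, has_red_or_yellow_hints=True)

print(looks_like_apple(red_apple))    # True
print(looks_like_apple(green_apple))  # True
print(looks_like_apple(banana))       # False
```

Here the questions and their order were picked by hand, which is exactly the intuition-based tree the text describes; a learned decision tree would instead choose them from the data.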
Ten Machine Learning Algorithms You Should Know to Become a Data Scientist
Let's say I am given an Excel sheet with data about various fruits and I have to tell which ones look like apples. I would ask the question "Which fruits are red and round?" and split the fruits into those that answer yes and those that answer no. Now, not all red and round fruits are apples, and not all apples are red and round. So I would ask "Which fruits have red or yellow colour hints on them?" of the red and round fruits, and "Which fruits are green and round?" of the fruits that are not red and round. Based on these questions, I can tell with considerable accuracy which are apples. This cascade of questions is what a decision tree is. However, this is a decision tree based on my intuition.